(57) Abstract: This invention provides methods for attaching [...]
detecting the identity of each nucleotide analogue [...] the
nucleotide [...]



MASSIVE PARALLEL METHOD FOR DECODING DNA AND RNA 



This application claims the benefit of U.S. Provisional 
Application No. 60/300,894, filed June 26, 2001, and is 
a continuation-in-part of U.S. Serial No. 09/684,670, 
filed October 6, 2000, the contents of both of which are 
hereby incorporated by reference in their entireties 
into this application. 

Background Of The Invention 

Throughout this application, various publications are 
referenced in parentheses by author and year. Full 
citations for these references may be found at the end 
of the specification immediately preceding the claims. 
The disclosures of these publications in their 
entireties are hereby incorporated by reference into 
this application to more fully describe the state of the 
art to which this invention pertains. 

The ability to sequence deoxyribonucleic acid (DNA)
accurately and rapidly is revolutionizing biology and
medicine. The massive Human Genome Project is driving
exponential growth in the development of high-throughput
genetic analysis technologies. This rapid technological
development involving chemistry, engineering, biology,
and computer science makes it possible to move from
studying single genes one at a time to analyzing and
comparing entire genomes.

[...] heterozygotes unambiguously and are not 100% accurate in
regions rich in nucleotides comprising guanine or
cytosine due to compressions (Bowling et al. 1991;
Yamakawa et al. 1997). In addition, the first few bases
after the priming site are often masked by the high
fluorescence signal from excess dye-labeled primers or
dye-labeled terminators, and are therefore difficult to
identify. Therefore, the requirement of electrophoresis
for DNA sequencing is still the bottleneck for
high-throughput DNA sequencing and mutation detection
projects.

The concept of sequencing DNA by synthesis without using
electrophoresis was first revealed in 1988 (Hyman 1988)
and involves detecting the identity of each nucleotide
as it is incorporated into the growing strand of DNA in
a polymerase reaction. Such a scheme coupled with the
chip format and laser-induced fluorescent detection has
the potential to markedly increase the throughput of DNA
sequencing projects. Consequently, several groups have
investigated such a system with an aim to construct an
ultra high-throughput DNA sequencing procedure
(Cheeseman 1994, Metzker et al. 1994). Thus far, no
complete success of using such a system to unambiguously
sequence DNA has been reported. The pyrosequencing
approach that employs four natural nucleotides
(comprising a base of adenine (A), cytosine (C), guanine
(G), or thymine (T)) and several other enzymes for
sequencing DNA by synthesis is now widely used for
mutation detection (Ronaghi 1998). In this approach,
the detection is based on the pyrophosphate (PPi)
released during the DNA polymerase reaction, the

[...] position of the deoxyribose ring in ddCTP is very
crowded, while there is ample space for modification at
the 5-position of the cytidine base.

The approach disclosed in the present application is to
make nucleotide analogues by linking a unique label such
as a fluorescent dye or a mass tag through a cleavable
linker to the nucleotide base or an analogue of the
nucleotide base, such as to the 5-position of the
pyrimidines (T and C) and to the 7-position of the
purines (G and A), to use a small cleavable chemical
moiety to cap the 3'-OH group of the deoxyribose to make
it nonreactive, and to incorporate the nucleotide
analogues into the growing DNA strand as terminators.
Detection of the unique label will yield the sequence
identity of the nucleotide. Upon removing the label and
the 3'-OH capping group, the polymerase reaction will
proceed to incorporate the next nucleotide analogue and
detect the next base.
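
The incorporate, detect, and cleave cycle described above can be
illustrated with a short simulation. This is a minimal sketch under
stated assumptions, not the disclosed chemistry: the names
NucleotideAnalogue and sequence_by_synthesis, and the "label-<base>"
naming of the unique labels, are hypothetical, and the template is
assumed to be given in the order in which the polymerase reads it.

```python
from dataclasses import dataclass
from typing import List, Optional

# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


@dataclass
class NucleotideAnalogue:
    """Toy model of a labeled, 3'-capped nucleotide analogue (hypothetical)."""
    base: str
    label: Optional[str]   # unique label; None once it has been cleaved
    capped: bool = True    # True while the 3'-OH carries the cleavable cap


def sequence_by_synthesis(template: str) -> str:
    """Return the bases read out, one per cycle.

    `template` is assumed to list the template bases in the order the
    polymerase encounters them.
    """
    growing_strand: List[NucleotideAnalogue] = []
    read: List[str] = []
    for template_base in template:
        # Extension is only possible from a free (uncapped) 3'-OH.
        if growing_strand and growing_strand[-1].capped:
            raise RuntimeError("3'-OH is still capped; cannot extend")
        # Incorporate the complementary analogue; its cap terminates extension.
        base = COMPLEMENT[template_base]
        analogue = NucleotideAnalogue(base=base, label=f"label-{base}")
        growing_strand.append(analogue)
        # Detect the unique label to identify the incorporated base.
        read.append(analogue.label.split("-", 1)[1])
        # Remove the label and the 3'-OH cap so the next cycle can proceed.
        analogue.label = None
        analogue.capped = False
    return "".join(read)
```

Calling sequence_by_synthesis("ATGC") walks through four cycles and
reads one base per cycle; the capped flag models why only a single
analogue is added in each cycle.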

It is also desirable to use a photocleavable group to
cap the 3'-OH group. However, a photocleavable group is
generally bulky and thus the DNA polymerase will have
difficulty incorporating the nucleotide analogues
containing a photocleavable moiety capping the 3'-OH
group. If small chemical moieties that can be easily
cleaved chemically with high yield can be used to cap
the 3'-OH group, such nucleotide analogues should also
be recognized as substrates for DNA polymerase. It has
been reported that 3'-O-methoxy-deoxynucleotides are
good substrates for several polymerases (Axelrod et al.
1978). 3'-O-allyl-dATP was also shown to be

[...] the laser induced fluorescence detection approach which
have overlapping fluorescence emission spectra, leading
to heterozygote detection difficulty, the MS approach
disclosed in this application produces very high
resolution of sequencing data by detecting the cleaved
small mass tags instead of the long DNA fragment. This
method also produces extremely fast separation in the
time scale of microseconds. The high resolution allows
accurate digital mutation and heterozygote detection.
Another advantage of sequencing with mass spectrometry
by detecting the small mass tags is that the
compressions associated with gel based systems are
completely eliminated.

In order to maintain a continuous hybridized primer 
extension product with the template DNA, a primer that 
contains a stable loop to form an entity capable of 
self-priming in a polymerase reaction can be ligated to 
the 3' end of each single stranded DNA template that is 
immobilized on a solid surface such as a chip. This 
approach will solve the problem of washing off the 
growing extension products in each cycle. 

Saxon and Bertozzi (2000) developed an elegant and 
highly specific coupling chemistry linking a specific 
group that contains a phosphine moiety to an azido group 
on the surface of a biological cell. In the present 
application, this coupling chemistry is adopted to
create a solid surface which is coated with a covalently
linked phosphine moiety, and to generate polymerase
chain reaction (PCR) products that contain an azido
group at the 5' end for specific coupling of the DNA

Summary Of The Invention

This invention is directed to a method for sequencing a
nucleic acid by detecting the identity of a nucleotide
analogue after the nucleotide analogue is incorporated
into a growing strand of DNA in a polymerase reaction,
which comprises the following steps:

(i) attaching a 5' end of the nucleic acid to a
solid surface;

(ii) attaching a primer to the nucleic acid
attached to the solid surface;

(iii) adding a polymerase and one or more different
nucleotide analogues to the nucleic acid to
thereby incorporate a nucleotide analogue into
the growing strand of DNA, wherein the
incorporated nucleotide analogue terminates
the polymerase reaction and wherein each
different nucleotide analogue comprises (a) a
base selected from the group consisting of
adenine, guanine, cytosine, thymine, and
uracil, and their analogues; (b) a unique
label attached through a cleavable linker to
the base or to an analogue of the base; (c) a
deoxyribose; and (d) a cleavable chemical
group to cap an -OH group at a 3'-position of
the deoxyribose;

(iv) washing the solid surface to remove
unincorporated nucleotide analogues;

wherein if the unique label is a dye, the order of
steps (v) through (vii) is: (v), (vi), and (vii);
and

wherein if the unique label is a mass tag, the
order of steps (v) through (vii) is: (vi), (vii),
and (v).
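
The label-dependent ordering of steps (v) through (vii) stated above
can be expressed as a simple lookup. The sketch below is illustrative
only: the names LabelType and step_order are hypothetical, and the
content of the individual steps is not modelled.

```python
from enum import Enum
from typing import Tuple


class LabelType(Enum):
    """Kinds of unique label named in the text (hypothetical enum)."""
    DYE = "dye"
    MASS_TAG = "mass tag"


# Order in which steps (v), (vi), and (vii) are carried out, by label type.
STEP_ORDER = {
    LabelType.DYE: ("v", "vi", "vii"),
    LabelType.MASS_TAG: ("vi", "vii", "v"),
}


def step_order(label: LabelType) -> Tuple[str, str, str]:
    """Return the order of steps (v) through (vii) for the given label type."""
    return STEP_ORDER[label]
```

For example, step_order(LabelType.MASS_TAG) places step (v) last,
matching the second "wherein" clause.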

The invention provides a method of attaching a nucleic 
acid to a solid surface which comprises: 

(i) coating the solid surface with a phosphine
moiety,

(ii) attaching an azido group to a 5' end of the
nucleic acid, and

(iii) immobilizing the 5' end of the nucleic acid
to the solid surface through interaction
between the phosphine moiety on the solid
surface and the azido group on the 5' end of
the nucleic acid.



The invention provides a nucleotide analogue which
comprises:

(a) a base selected from the group consisting of
adenine or an analogue of adenine, cytosine or
an analogue of cytosine, guanine or an
analogue of guanine, thymine or an analogue of
thymine, and uracil or an analogue of uracil;

Brief Description Of The Figures

Figure 1: The 3D structure of the ternary complexes of
rat DNA polymerase, a DNA template-primer, and
dideoxycytidine triphosphate (ddCTP). The left side of
the illustration shows the mechanism for the addition of
ddCTP and the right side of the illustration shows the
active site of the polymerase. Note that the 3'
position of the dideoxyribose ring is very crowded,
while ample space is available at the 5 position of the
cytidine base.

Figure 2A-2B: Scheme of the sequencing by synthesis
approach. A: Example where the unique labels are dyes
and the solid surface is a chip. B: Example where the
unique labels are mass tags and the solid surface is
channels etched into a glass chip. A, C, G, T;
nucleotide triphosphates comprising bases adenine,
cytosine, guanine, and thymine; d, deoxy; dd, dideoxy;
R, cleavable chemical group used to cap the -OH group;
Y, cleavable linker.

Figure 3: The synthetic scheme for the immobilization of
an azido (N3) labeled DNA fragment to a solid surface
coated with a triarylphosphine moiety. Me, methyl group;
P, phosphorus; Ph, phenyl.

Figure 4: The synthesis of triarylphosphine
N-hydroxysuccinimide (NHS) ester.

Figure 5: The synthetic scheme for attaching an azido
(N3) group through a linker to the 5' end of a DNA

[...] one, based on the complementary template. The dye is
detected and cleaved to test the approach. Dye1 = Fam;
Dye2 = R6G; Dye3 = Tam; Dye4 = Rox.

Figure 10: The expected photocleavage products of DNA
containing a photo-cleavable dye (Tam). Light
absorption (300 - 360 nm) by the aromatic 2-nitrobenzyl
moiety causes reduction of the 2-nitro group to a
nitroso group and an oxygen insertion into the
carbon-hydrogen bond located in the 2-position followed by
cleavage and decarboxylation (Pillai 1980).

Figure 11: Synthesis of PC-LC-Biotin-FAM to evaluate the
photolysis efficiency of the fluorophore coupled with
the photocleavable linker 2-nitrobenzyl group.

Figure 12: Fluorescence spectra (λex = 480 nm) of
PC-LC-Biotin-FAM immobilized on a microscope glass slide
coated with streptavidin (a); after 10 min photolysis
(λirr = 350 nm; ~0.5 mW/cm²) (b); and after washing with
water to remove the photocleaved dye (c).

Figure 13A-13B: Synthetic scheme for capping the 3'-OH
of nucleotide.

Figure 14: Chemical cleavage of the MOM group (top row)
and the allyl group (bottom row) to free the 3'-OH in
the nucleotide. ClTMS = chlorotrimethylsilane.

Figure 15A-15B: Examples of energy transfer coupled dye
systems, where Fam or Cy2 is employed as a light
absorber (energy transfer donor) and Cl2Fam, Cl2R6G,

Figure 20: Example of synthesis of NHS ester of one
mass tag (Tag-3). A similar scheme is used to create
other mass tags.

Figure 21: A representative scheme for the synthesis of
the nucleotide analogue 3'-RO-G-Tag3. A similar scheme is
used to create the other three modified bases 3'-RO-A-Tag1,
[...] tetrakis(triphenylphosphine)palladium(0); (ii) POCl3,
Bn4N+ pyrophosphate; (iii) NH4OH; (iv) Na2CO3/NaHCO3
(pH = 9.0)/DMSO.

Figure 22: Examples of expected photocleavage products
of DNA containing a photocleavable mass tag.

Figure 23: System for DNA sequencing comprising
multiple channels in parallel and multiple mass
spectrometers in parallel. The example shows 96 channels
in a silica glass chip.

Figure 24: Parallel mass spectrometry system for DNA
sequencing. Example shows three mass spectrometers in
parallel. Samples are injected into the ion source
where they are mixed with a nebulizer gas and ionized. A
turbo pump is used to continuously sweep away free
radicals, neutral compounds and other undesirable
elements coming from the ion source. A second turbo
pump is used to generate a continuous vacuum in all
three analyzers and detectors simultaneously. The
acquired signal is then converted to a digital signal by
the A/D converter. All three signals are then sent to

Detailed Description Of The Invention

The following definitions are presented as an aid in
understanding this invention.

As used herein, to cap an -OH group means to replace the
"H" in the -OH group with a chemical group. As
disclosed herein, the -OH group of the nucleotide
analogue is capped with a cleavable chemical group. To
uncap an -OH group means to cleave the chemical group
from a capped -OH group and to replace the chemical
group with "H", i.e., to replace the "R" in -OR with "H"
wherein "R" is the chemical group used to cap the -OH
group.

The nucleotide bases are abbreviated as follows: adenine
(A), cytosine (C), guanine (G), thymine (T), and uracil
(U).

An analogue of a nucleotide base refers to a structural
and functional derivative of the base of a nucleotide
which can be recognized by polymerase as a substrate.
That is, for example, an analogue of adenine (A) should
form hydrogen bonds with thymine (T), a C analogue
should form hydrogen bonds with G, a G analogue should
form hydrogen bonds with C, and a T analogue should
form hydrogen bonds with A, in a double helix format.
Examples of analogues of nucleotide bases include, but
are not limited to, 7-deaza-adenine and 7-deaza-guanine,
wherein the nitrogen atom at the 7-position of adenine
or guanine is substituted with a carbon atom.
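
The pairing requirement in this definition can be illustrated with a
small lookup: an analogue is expected to pair with the same partner as
the base it is derived from. The sketch below is a hedged
illustration; the names ANALOGUE_OF and pairing_partner are
hypothetical, and only the two analogues named in the text are listed.

```python
# Hydrogen-bonding partner of each base in a double helix.
PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C", "U": "A"}

# Parent base of each analogue named in the text.
ANALOGUE_OF = {
    "7-deaza-adenine": "A",
    "7-deaza-guanine": "G",
}


def pairing_partner(base: str) -> str:
    """Return the base expected to hydrogen-bond with `base`.

    An analogue is mapped to its parent base first, so that it pairs
    like the base it is derived from.
    """
    parent = ANALOGUE_OF.get(base, base)
    return PARTNER[parent]
```

Under this sketch, pairing_partner("7-deaza-adenine") gives the same
answer as pairing_partner("A").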

The present invention is directed to a method for
sequencing a nucleic acid by detecting the identity of a
nucleotide analogue after the nucleotide analogue is
incorporated into a growing strand of DNA in a
polymerase reaction, which comprises the following
steps:

(i) attaching a 5' end of the nucleic acid to a
solid surface;

(ii) attaching a primer to the nucleic acid
attached to the solid surface;

(iii) adding a polymerase and one or more different
nucleotide analogues to the nucleic acid to
thereby incorporate a nucleotide analogue into
the growing strand of DNA, wherein the
incorporated nucleotide analogue terminates
the polymerase reaction and wherein each
different nucleotide analogue comprises (a) a
base selected from the group consisting of
adenine, guanine, cytosine, thymine, and
uracil, and their analogues; (b) a unique
label attached through a cleavable linker to
the base or to an analogue of the base; (c) a
deoxyribose; and (d) a cleavable chemical
group to cap an -OH group at a 3'-position of
the deoxyribose;

(iv) washing the solid surface to remove
unincorporated nucleotide analogues;

wherein if the unique label is a mass tag, the
order of steps (v) through (vii) is: (vi), (vii),
and (v).

In one embodiment of any of the nucleotide analogues
described herein, the nucleotide base is adenine. In one
embodiment, the nucleotide base is guanine. In one
embodiment, the nucleotide base is cytosine. In one
embodiment, the nucleotide base is thymine. In one
embodiment, the nucleotide base is uracil. In one
embodiment, the nucleotide base is an analogue of
adenine. In one embodiment, the nucleotide base is an
analogue of guanine. In one embodiment, the nucleotide
base is an analogue of cytosine. In one embodiment, the
nucleotide base is an analogue of thymine. In one
embodiment, the nucleotide base is an analogue of
uracil.

In different embodiments of any of the inventions
described herein, the solid surface is glass, silicon,
or gold. In different embodiments, the solid surface is
a magnetic bead, a chip, a channel in a chip, or a
porous channel in a chip. In one embodiment, the solid
surface is glass. In one embodiment, the solid surface
is silicon. In one embodiment, the solid surface is
gold. In one embodiment, the solid surface is a
magnetic bead. In one embodiment, the solid surface is a
chip. In one embodiment, the solid surface is a channel
in a chip. In one embodiment, the solid surface is a
porous channel in a chip. Other materials can also be
used as long as the material does not interfere with the
steps of the method.

[...] the solid surface is a ribonucleic acid (RNA), and the
polymerase in step (iii) is reverse transcriptase.

In one embodiment, the primer is attached to a 3' end of
the nucleic acid in step (ii), and the attached primer
comprises a stable loop and an -OH group at a
3'-position of a deoxyribose capable of self-priming in the
polymerase reaction. In one embodiment, the step of
attaching the primer to the nucleic acid comprises
hybridizing the primer to the nucleic acid or ligating
the primer to the nucleic acid. In one embodiment, the
primer is attached to the nucleic acid through a
ligation reaction which links the 3' end of the nucleic
acid with the 5' end of the primer.

In one embodiment, one or more of four different
nucleotide analogues is added in step (iii), wherein each
different nucleotide analogue comprises a different base
selected from the group consisting of thymine or uracil
or an analogue of thymine or uracil, adenine or an
analogue of adenine, cytosine or an analogue of
cytosine, and guanine or an analogue of guanine, and
wherein each of the four different nucleotide analogues
comprises a unique label.

In one embodiment, the cleavable chemical group that
caps the -OH group at the 3'-position of the deoxyribose
in the nucleotide analogue is -CH2OCH3 or -CH2CH=CH2. Any
chemical group could be used as long as the group 1) is
stable during the polymerase reaction, 2) does not
interfere with the recognition of the nucleotide



In one embodiment, the unique label that is attached to
the nucleotide analogue is a mass tag that can be
detected and differentiated by a mass spectrometer. In
further embodiments, the mass tag is selected from the
group consisting of a 2-nitro-α-methyl-benzyl group, a
2-nitro-α-methyl-3-fluorobenzyl group, a
2-nitro-α-methyl-3,4-difluorobenzyl group, and a
2-nitro-α-methyl-3,4-dimethoxybenzyl group. In one
embodiment, the mass tag is a 2-nitro-α-methyl-benzyl
group. In one embodiment, the mass tag is a
2-nitro-α-methyl-3-fluorobenzyl group. In one embodiment,
the mass tag is a 2-nitro-α-methyl-3,4-difluorobenzyl
group. In one embodiment, the mass tag is a
2-nitro-α-methyl-3,4-dimethoxybenzyl group. In one
embodiment, the mass tag is detected using a parallel
mass spectrometry system
which comprises a plurality of atmospheric pressure
chemical ionization mass spectrometers for parallel
analysis of a plurality of samples comprising mass tags.

In one embodiment, the unique label is attached through
a cleavable linker to a 5-position of cytosine or
thymine or to a 7-position of deaza-adenine or
deaza-guanine. The unique label could also be attached
through a cleavable linker to another position in the
nucleotide analogue as long as the attachment of the
label is stable during the polymerase reaction and the
nucleotide analogue can be recognized by polymerase as a
substrate. For example, the cleavable label could be
attached to the deoxyribose.

[...] different dideoxynucleotides are selected from the group
consisting of 2',3'-dideoxyadenosine 5'-triphosphate,
2',3'-dideoxyguanosine 5'-triphosphate,
2',3'-dideoxycytidine 5'-triphosphate,
2',3'-dideoxythymidine 5'-triphosphate,
2',3'-dideoxyuridine 5'-triphosphate,
and their analogues. In one embodiment, the
dideoxynucleotide is 2',3'-dideoxyadenosine
5'-triphosphate. In one embodiment, the dideoxynucleotide
is 2',3'-dideoxyguanosine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is
2',3'-dideoxycytidine 5'-triphosphate. In one embodiment, the
dideoxynucleotide is 2',3'-dideoxythymidine
5'-triphosphate. In one embodiment, the dideoxynucleotide
is 2',3'-dideoxyuridine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxyadenosine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxyguanosine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxycytidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxythymidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxyuridine 5'-triphosphate.

In one embodiment, a polymerase and one or more of four
different dideoxynucleotides are added in step (vi),
wherein each different dideoxynucleotide is selected
from the group consisting of 2',3'-dideoxyadenosine
5'-triphosphate or an analogue of 2',3'-dideoxyadenosine
5'-triphosphate; 2',3'-dideoxyguanosine 5'-triphosphate
or an analogue of 2',3'-dideoxyguanosine 5'-



The invention provides a method for simultaneously 
sequencing a plurality of different nucleic acids, which 
comprises simultaneously applying any of the methods 
disclosed herein for sequencing a nucleic acid to the 
plurality of different nucleic acids. In different 
embodiments, the method can be used to sequence from one 
to over 100,000 different nucleic acids simultaneously. 

The invention provides for the use of any of the methods
disclosed herein for detection of single nucleotide
polymorphisms, genetic mutation analysis, serial
analysis of gene expression, gene expression analysis,
identification in forensics, genetic disease association
studies, DNA sequencing, genomic sequencing,
translational analysis, or transcriptional analysis.

The invention provides a method of attaching a nucleic 
acid to a solid surface which comprises: 

(i) coating the solid surface with a phosphine
moiety,

(ii) attaching an azido group to a 5' end of the
nucleic acid, and

(iii) immobilizing the 5' end of the nucleic acid
to the solid surface through interaction
between the phosphine moiety on the solid
surface and the azido group on the 5' end of
the nucleic acid.

(a) a base selected from the group consisting of
adenine or an analogue of adenine, cytosine or
an analogue of cytosine, guanine or an
analogue of guanine, thymine or an analogue of
thymine, and uracil or an analogue of uracil;

(b) a unique label attached through a cleavable
linker to the base or to an analogue of the
base;

(c) a deoxyribose; and

(d) a cleavable chemical group to cap an -OH group
at a 3'-position of the deoxyribose.

In one embodiment of the nucleotide analogue, the
cleavable chemical group that caps the -OH group at the
3'-position of the deoxyribose is -CH2OCH3 or
-CH2CH=CH2.

In one embodiment, the unique label is a fluorescent
moiety or a fluorescent semiconductor crystal. In
further embodiments, the fluorescent moiety is selected
from the group consisting of 5-carboxyfluorescein,
6-carboxyrhodamine-6G,
N,N,N',N'-tetramethyl-6-carboxyrhodamine, and
6-carboxy-X-rhodamine.

In one embodiment, the unique label is a fluorescence
energy transfer tag which comprises an energy transfer
donor and an energy transfer acceptor. In further
embodiments, the energy transfer donor is
5-carboxyfluorescein or cyanine, and wherein the energy

In one embodiment, the cleavable chemical group used to
cap the -OH group at the 3'-position of the deoxyribose
is cleavable by a means selected from the group
consisting of one or more of a physical means, a
chemical means, a physical chemical means, heat, and
light.

In different embodiments, the nucleotide analogue is
selected from the group consisting of:

[chemical structures not reproduced]

wherein R is -CH2OCH3 or -CH2CH=CH2.

In different embodiments, the nucleotide analogue is
selected from the group consisting of:

[chemical structures not reproduced]

wherein R is -CH2OCH3 or -CH2CH=CH2.

This invention will be better understood from the
Experimental Details which follow. However, one skilled
in the art will readily appreciate that the specific
methods and results discussed are merely illustrative of
the invention as described more fully in the claims
which follow thereafter.

[...] (DMSO)/NaHCO3 (pH = 8.2) overnight at room temperature
produces PC-LC-Biotin-FAM which is composed of a biotin
at one end, a photocleavable 2-nitrobenzyl group in the
middle, and a dye tag (FAM) at the other end. This
photocleavable moiety closely mimics the designed
photocleavable nucleotide analogues shown in Figure 10.
Thus the successful photolysis of the PC-LC-Biotin-FAM
moiety provides proof of the principle of high
efficiency photolysis as used in the DNA sequencing
system. For the photolysis study, PC-LC-Biotin-FAM is first
immobilized on a microscope glass slide coated with
streptavidin (XENOPORE, Hawthorne NJ). After washing
off the non-immobilized PC-LC-Biotin-FAM, the
fluorescence emission spectrum of the immobilized
PC-LC-Biotin-FAM was taken as shown in Figure 12 (Spectrum a).
The strong fluorescence emission indicates that
PC-LC-Biotin-FAM is successfully immobilized to the
streptavidin coated slide surface. The
photocleavability of the 2-nitrobenzyl linker by
irradiation at 350 nm was then tested. After 10 minutes
of photolysis (λirr = 350 nm; ~0.5 mW/cm²) and before any
washing, the fluorescence emission spectrum of the same
spot on the slide was taken and showed no decrease in
intensity (Figure 12, Spectrum b), indicating that the
dye (FAM) was not bleached during the photolysis process
at 350 nm. After washing the glass slide with HPLC
water following photolysis, the fluorescence emission
spectrum of the same spot on the slide showed a
significant intensity decrease (Figure 12, Spectrum c),
which indicates that most of the fluorescent dye (FAM)
was cleaved from the immobilized biotin moiety and was
removed by the washing procedure. This experiment shows

The ET primer and ET dideoxynucleotides have been shown
to be a superior set of reagents for 4-color DNA
sequencing that allows the use of one laser to excite
multiple sets of fluorescent tags (Ju et al. 1995). It
has been shown that DNA polymerase (Thermo Sequenase and
Taq FS) can efficiently incorporate the ET dye labeled
dideoxynucleotides (Rosenblum et al. 1997). These ET
dye-labeled sequencing reagents are now widely used in
large scale DNA sequencing projects, such as the human
genome project. A library of ET dye labeled nucleotide
analogues can be synthesized as shown in Figure 15 for
optimization of the DNA sequencing system. The ET dye
set (FAM-Cl2FAM, FAM-Cl2R6G, FAM-Cl2TAM, FAM-Cl2ROX) using
FAM as a donor and dichloro(FAM, R6G, TAM, ROX) as
acceptors has been reported in the literature (Lee et
al. 1997) and constitutes a set of commercially
available DNA sequencing reagents. These ET dye sets
have been proven to produce enhanced fluorescence
intensity, and the nucleotides labeled with these ET
dyes at the 5-position of T and C and the 7-position of
G and A are excellent substrates of DNA polymerase.
Alternatively, an ET dye set can be constructed using
cyanine (Cy2) as a donor and Cl2FAM, Cl2R6G, Cl2TAM, or
Cl2ROX as energy acceptors. Since Cy2 possesses higher
molar absorbance compared with the rhodamine and
fluorescein derivatives, an ET system using Cy2 as a
donor produces much stronger fluorescence signals than
the system using FAM as a donor (Hung et al. 1996).
Figure 16 shows a synthetic scheme for an ET dye labeled
nucleotide analogue with Cy2 as a donor and Cl2FAM as an
acceptor using similar coupling chemistry as for the
synthesis of an energy transfer system using FAM as a



and Tag4 are four different unique cleavable mass tags.
Four specific examples of nucleotide analogues are shown
in Figure 19. In Figure 19, "R" is H when the 3'-OH
group is not capped. As discussed above, the
photocleavable 2-nitrobenzyl moiety has been used to link
biotin to DNA and protein for efficient removal by UV
light (350 nm) irradiation (Olejnik et al. 1995,
1999). Four different 2-nitrobenzyl groups with
different molecular weights as mass tags are used to
form the mass tag labeled nucleotides as shown in Figure
19: 2-nitro-α-methyl-benzyl (Tag-1) codes for A;
2-nitro-α-methyl-3-fluorobenzyl (Tag-2) codes for C;
2-nitro-α-methyl-3,4-difluorobenzyl (Tag-3) codes for G;
2-nitro-α-methyl-3,4-dimethoxybenzyl (Tag-4) codes for
T.
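
The tag-to-base coding listed above amounts to a four-entry lookup
table. The sketch below shows how a series of detected tags, one per
sequencing cycle, could be decoded into a base sequence. It is an
illustration under stated assumptions: the names TAG_TO_BASE and
decode_tags are hypothetical, and the tags are assumed to have been
identified already by the mass spectrometer, so no mass values appear.

```python
from typing import Iterable

# Coding of the four mass tags, as listed in the text.
TAG_TO_BASE = {
    "Tag-1": "A",  # 2-nitro-alpha-methyl-benzyl
    "Tag-2": "C",  # 2-nitro-alpha-methyl-3-fluorobenzyl
    "Tag-3": "G",  # 2-nitro-alpha-methyl-3,4-difluorobenzyl
    "Tag-4": "T",  # 2-nitro-alpha-methyl-3,4-dimethoxybenzyl
}


def decode_tags(detected_tags: Iterable[str]) -> str:
    """Translate the tag detected in each cycle into the base it codes for."""
    return "".join(TAG_TO_BASE[tag] for tag in detected_tags)
```

An unrecognized tag name raises a KeyError in this sketch, which makes
a miscalled cycle visible rather than silently skipping it.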

As a representative example, the synthesis of the NHS
ester of one mass tag (Tag-3) is shown in Figure 20. A
similar scheme is used to create the other mass tags.
The synthesis of 3'-HO-G-Tag3 is shown in Figure 21 using
well-established procedures (Prober et al. 1987; Lee et
al. 1992 and Hobbs et al. 1991). 7-propargylamino-dGTP
is first prepared by reacting 7-I-dGTP with
N-trifluoroacetylpropargyl amine, which is then coupled
with the NHS-Tag-3 to produce 3'-HO-G-Tag3. The
nucleotide analogues with a free 3'-OH are good
substrates for the polymerase.

The sequencing by synthesis approach can be tested using
mass tags with a scheme similar to that shown for dyes
in Figure 9. A DNA template containing a portion of
nucleotide sequence that has no repeated sequences after

[...] chemically with high yield as shown in Figure 14
(Ireland et al. 1986; Kamal et al. 1999). The chemical
cleavage of the MOM and allyl groups is fairly mild and
specific, so as not to degrade the DNA template moiety.

7. Parallel Channel System for Sequencing by Synthesis
Figure 23 illustrates an example of a parallel channel 
system. The system can be used with mass tag labels as 
shown and also with dye labels. A plurality of channels 
in a silica glass chip are connected on each end of the 
channel to a well in a well plate. In the example shown 
there are 96 channels each connected to its own wells. 
The sequencing system also permits a number of channels 
other than 96 to be used. 96 channel devices for 
separating DNA sequencing and sizing fragments have been 
reported (Woolley and Mathies 1994, Woolley et al. 1997,
Simpson et al. 1998). The chip is made by
photolithographic masking and chemical etching
techniques. The photolithographically defined channel
patterns are etched in a silica glass substrate, and
then capillary channels (id ~ 100 μm) are formed by
thermally bonding the etched substrate to a second
silica glass slide. Channels are porous to increase
surface area. The immobilized single stranded DNA
template chip is prepared according to the scheme shown
in Figure 3. Each channel is first treated with 0.5 M
NaOH, washed with water, and is then coated with high
density 3-aminopropyltrimethoxysilane in aqueous ethanol
(Woolley et al. 1994) forming a primary amine surface.
Succinimidyl (NHS) ester of triarylphosphine (1) is
covalently coupled with the primary amine group 
converting the amine surface to a novel triarylphosphine 



To make mass spectrometry competitive with a 96 
capillary array method for analyzing DNA, a parallel 
mass spectrometer approach is needed. Such a complete 
system has not been reported mainly due to the fact that 
most of the mass spectrometers are designed to achieve 
adequate resolution for large biomolecules. The system 
disclosed herein requires the detection of four mass 
tags, with molecular weight range between 150 and 250 
daltons, coding for the identity of the four nucleotides 
(A, C, G, T). Since a mass spectrometer dedicated to
detection of these mass tags only requires high
resolution for the mass range of 150 to 250 daltons
instead of covering a wide mass range, the mass
spectrometer can be miniaturized and have a simple
design. Either quadrupole (including ion trap detector)
or time-of-flight mass spectrometers can be selected for
the ion optics. While modern mass spectrometer
technology has made it possible to produce miniaturized
mass spectrometers, most current research has focused on
the design of a single stand-alone miniaturized mass
spectrometer. Individual components of the mass
spectrometer have been miniaturized for enhancing the
mass spectrometer analysis capability (Liu et al. 2000,
Zhang et al. 1999). A miniaturized mass spectrometry
system using multiple analyzers (up to 10) in parallel
has been reported (Badman and Cooks 2000). However,
the mass spectrometer of Badman and Cooks was designed to
measure only single samples rather than multiple samples
in parallel. They also noted that the miniaturization
of the ion trap limited the capability of the mass
spectrometer to scan wide mass ranges. Since the
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scanning mode of mass spectrometers are the same for 
each miniaturized mass spectrometer, one power supply 
for each analyzer and the ionization source can provide 
the necessary power for all three instruments. One 
power supply for each of the three independent detectors 
is used for spectrum collection. The data obtained are 
transferred to three independent A/D converters and 
processed by the data system simultaneously to identify 
the mass tag in the injected sample and thus identify 
the nucleotide. Despite containing three mass
spectrometers, the entire device is able to fit on a
laboratory bench top.
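
The identification step described above, in which each detector's signal is matched to one of four mass tags in the 150 to 250 dalton range, might be sketched as follows. This is a minimal sketch under loud assumptions: the tag masses below are placeholders invented so the code runs and are not the actual masses of the tags, and the function names and the tolerance value are likewise hypothetical.

```python
# Mass window stated in the text for the four mass tags (daltons).
MASS_WINDOW = (150.0, 250.0)

# HYPOTHETICAL placeholder masses, chosen only so the sketch runs.
# They are NOT the real masses of the four mass tags.
PLACEHOLDER_TAG_MASSES = {"A": 160.0, "C": 180.0, "G": 200.0, "T": 220.0}


def call_base(peak_mass, tolerance=0.5):
    """Return the base whose placeholder tag mass matches the peak, else None."""
    low, high = MASS_WINDOW
    if not (low <= peak_mass <= high):
        return None  # outside the range the dedicated analyzer needs to cover
    for base, mass in PLACEHOLDER_TAG_MASSES.items():
        if abs(peak_mass - mass) <= tolerance:
            return base
    return None


def call_bases_parallel(peaks_by_detector):
    """Call one base per detector; each detector is handled independently."""
    return [call_base(peak) for peak in peaks_by_detector]


print(call_bases_parallel([160.1, 219.8, 400.0]))  # ['A', 'T', None]
```

Restricting the search to a narrow mass window is what allows a small, simple analyzer in this scheme; the sketch mirrors that by rejecting any peak outside the window before matching.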

9. Validate the Complete Sequencing by Synthesis System 
By Sequencing P53 Genes 

The tumor suppressor gene p53 can be used as a model 
system to validate the DNA sequencing system. The p53 
gene is one of the most frequently mutated genes in 
human cancer (O'Connor et al. 1997). First, a base pair
DNA template (shown below) is synthesized containing an
azido group at the 5' end and a portion of the sequences
from exon 7 and exon 8 of the p53 gene:

5'-N3-TTCCTGCATGGGCGGCATGAACCCGAGGCCCATCCTCACCATCATCAC
ACTGGAAGACTCCAGTGGTAATCTACTGGGACGGAACAGCTTTGAGGTGCATT
-3' (SEQ ID NO: 2).
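
Because the polymerase builds a strand complementary to the immobilized template, the bases reported by sequencing by synthesis are those of the complementary strand. The sketch below computes that strand for the template above. The helper name `reverse_complement` is an assumption for illustration, and the sketch ignores where the primer anneals, which the text does not specify here.

```python
# Template bases copied from SEQ ID NO: 2 above (the 5' azido group is omitted).
TEMPLATE = (
    "TTCCTGCATGGGCGGCATGAACCCGAGGCCCATCCTCACCATCATCAC"
    "ACTGGAAGACTCCAGTGGTAATCTACTGGGACGGAACAGCTTTGAGGTGCATT"
)

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# The complementary strand, in the order a polymerase would synthesize it.
synthesized = reverse_complement(TEMPLATE)
print(synthesized)
```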

This template is chosen to explore the use of the 
sequencing system for the detection of clustered hot 
spot single base mutations. The potentially mutated 
bases are underlined (A, G, C and T) in the synthetic 
template. The synthetic template is immobilized on a 
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What is claimed is: 



1. A method for sequencing a nucleic acid by detecting
the identity of a nucleotide analogue after the 
nucleotide analogue is incorporated into a growing 
strand of DNA in a polymerase reaction, which 
comprises the following steps: 

(i) attaching a 5' end of the nucleic acid to a 
solid surface; 

(ii) attaching a primer to the nucleic acid
attached to the solid surface; 

(iii) adding a polymerase and one or more different 
nucleotide analogues to the nucleic acid to 
thereby incorporate a nucleotide analogue into 
the growing strand of DNA, wherein the
incorporated nucleotide analogue terminates 
the polymerase reaction and wherein each 
different nucleotide analogue comprises (a) a 
base selected from the group consisting of 
adenine, guanine, cytosine, thymine, and 
uracil, and their analogues; (b) a unique 
label attached through a cleavable linker to 
the base or to an analogue of the base; (c) a 
deoxyribose; and (d) a cleavable chemical 
group to cap an -OH group at a 3'-position of
the deoxyribose; 



(iv) washing the solid surface, to remove 
unincorporated nucleotide analogues; 
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wherein if the unique label is a dye, the order of 
steps (v) through (vii) is: (v), (vi), and (vii);
and 

wherein if the unique label is a mass tag, the 
order of steps (v) through (vii) is: (vi), (vii), 
and (v).
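
The label-dependent ordering recited above can be expressed as a small lookup. This is a sketch only: the function name `step_order` and the label strings "dye" and "mass tag" are assumptions chosen for illustration, and the steps are represented just by their roman-numeral labels.

```python
def step_order(label_type):
    """Return the order of steps (v) through (vii) for a given label type."""
    if label_type == "dye":
        return ["v", "vi", "vii"]
    if label_type == "mass tag":
        return ["vi", "vii", "v"]
    raise ValueError(f"unknown label type: {label_type!r}")


print(step_order("dye"))       # ['v', 'vi', 'vii']
print(step_order("mass tag"))  # ['vi', 'vii', 'v']
```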

The method of claim 1, wherein the solid surface is 
glass, silicon, or gold. 

The method of claim 1, wherein the solid surface is
a magnetic bead, a chip, a channel in a chip, or a 
porous channel in a chip. 

The method of claim 1, wherein the step of 
attaching the nucleic acid to the solid surface 
comprises:

(i) coating the solid surface with a phosphine 
moiety, 

(ii) attaching an azido group to the 5' end of the 
nucleic acid, and 

(iii) immobilizing the 5' end of the nucleic acid 
to the solid surface through interaction 
between the phosphine moiety on the solid 
surface and the azido group on the 5' end of 

the nucleic acid.



deoxyribose capable of self-priming in the
polymerase reaction.



The method of claim 1, wherein the step of 
attaching the primer to the nucleic acid comprises 
hybridizing the primer to the nucleic acid or 
ligating the primer to the nucleic acid. 



The method of claim 1, wherein one or more of four 
different nucleotide analogues is added in step 
(iii), wherein each different nucleotide analogue
comprises a different base selected from the group
consisting of thymine or uracil or an analogue of 
thymine or uracil, adenine or an analogue of 
adenine, cytosine or an analogue of cytosine, and 
guanine or an analogue of guanine, and wherein each 
of the four different nucleotide analogues 
comprises a unique label. 

The method of claim 1, wherein the cleavable 
chemical group that caps the -OH group at the 3'-
position of the deoxyribose in the nucleotide
analogue is -CH2OCH3 or -CH2CH=CH2.



The method of claim 1, wherein the unique label
that is attached to the nucleotide analogue is a
fluorescent moiety or a fluorescent semiconductor
crystal.



The method of claim 13, wherein the fluorescent
moiety is selected from the group consisting of 5- 
carboxyfluorescein, 6-carboxyrhodamine-6G, 
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of cytosine or thymine or to a 7-position of deaza- 
adenine or deaza-guanine.






20. The method of claim 1, wherein the cleavable linker 
between the unique label and the nucleotide 
analogue is cleaved by a means selected from the 
group consisting of one or more of a physical 
means, a chemical means, a physical chemical means, 
heat, and light. 

21. The method of claim 20, wherein the cleavable 
linker is a photocleavable linker which comprises a 
2-nitrobenzyl moiety. 



22. The method of claim 1, wherein the cleavable
chemical group used to cap the -OH group at the 3'-
position of the deoxyribose is cleaved by a means
selected from the group consisting of one or more
of a physical means, a chemical means, a physical
chemical means, heat, and light.



23. The method of claim 1, wherein the chemical
compounds added in step (vi) to permanently cap any
unreacted -OH group on the primer attached to the
nucleic acid or on the primer extension strand are
a polymerase and one or more different
dideoxynucleotides or analogues of
dideoxynucleotides.

24. The method of claim 23, wherein the different
dideoxynucleotides are selected from the group
consisting of 2',3'-dideoxyadenosine 5'-
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siraultaneously applying the method of claim 1 to 
the plurality of different nucleic acids. 

Use of the method of claim 1 or 27 for detection of 
single nucleotide polymorphisms, genetic mutation 
analysis, serial analysis of gene expression, gene 
expression analysis, identification in forensics, 
genetic disease association studies, DNA 
sequencing, genomic sequencing, translational 
analysis, or transcriptional analysis. 

A method of attaching a nucleic acid to a solid
surface which comprises: 

(i) coating the solid surface with a phosphine 
moiety, 

(ii) attaching an azido group to a 5' end of the 
nucleic acid, and 

(iii) immobilizing the 5' end of the nucleic acid
to the solid surface through interaction
between the phosphine moiety on the solid
surface and the azido group on the 5' end of 
the nucleic acid. 

The method of claim 29, wherein the step of coating 
the solid surface with the phosphine moiety 
comprises:

(i) coating the surface with a primary amine, 

and 
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(a) a base selected from the group consisting of 
adenine or an analogue of adenine, cytosine or 
an analogue of cytosine, guanine or an 
analogue of guanine, thymine or an analogue of 
thymine, and uracil or an analogue of uracil; 



(b) a unique label attached through a cleavable 
linker to the base or to an analogue of the 
base; 

(c) a deoxyribose; and 

(d) a cleavable chemical group to cap an -OH group 
at a 3'-position of the deoxyribose.

The nucleotide analogue of claim 37, wherein the
cleavable chemical group that caps the -OH group at
the 3'-position of the deoxyribose is -CH2OCH3 or
-CH2CH=CH2.



The nucleotide analogue of claim 37, wherein the 
unique label is a fluorescent moiety or a 
fluorescent semiconductor crystal.



The nucleotide analogue of claim 39, wherein the
fluorescent moiety is selected from the group
consisting of 5-carboxyfluorescein, 6-
carboxyrhodamine-6G, N,N,N',N'-tetramethyl-6-
carboxyrhodamine, and 6-carboxy-X-rhodamine.

The nucleotide analogue of claim 37, wherein the 
unique label is a fluorescence energy transfer tag 
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means, a chemical means, a physical chemical means , 
heat, and light. 



The nucleotide analogue of claim 46, wherein the
cleavable linker is a photocleavable linker which 
comprises a 2-nitrobenzyl moiety. 



The nucleotide analogue of claim 37, wherein the
cleavable chemical group used to cap the -OH group
at the 3'-position of the deoxyribose is cleavable
by a means selected from the group consisting of
one or more of a physical means, a chemical means,
a physical chemical means, heat, and light.
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The nucleotide analogue of claim 4 9 , wherein the 

nucleotide analogue is selected from the group 
consisting of: 




wherein R is -CH2OCH3 or -CH2CH=CH2.
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The nucleotide analogue of claim 51 , wherein the 

nucleotide analogue is selected from the group 
consisting of: 




wherein R is -CH2OCH3 or -CH2CH=CH2.



The system of claim 54, wherein the mass tags have
molecular weights between 150 daltons and 250
daltons.



Use of the system of claim 54 for DNA sequencing 
analysis, detection of single nucleotide 
polymorphisms, genetic mutation analysis, serial 
analysis of gene expression, gene expression 
analysis, identification in forensics, genetic 
disease association studies, DNA sequencing, 
genomic sequencing, translational analysis, or 
transcriptional analysis.
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